Abstract

Characterization of mosquito breeding habitats is often accomplished with the goal of guiding larval control 
interventions as well as the goal of identifying areas with higher disease risk. This characterization often relies on 
statistical measures of association (e.g., regression coefficients) between covariates and presence/absence or 
abundance of larvae. Here we contend that these measures of association are not enough; researchers should also 
study the spatial and temporal distribution of water bodies. We provide recommendations on how current 
methodology may be improved to adequately take into account the distribution of water bodies. 
Letter to the editor

Studies on mosquito breeding sites typically survey
water bodies to determine larval presence or abundance.
Then, measures of association are estimated (e.g.,
regression coefficients, correlation coefficients, or
analysis of variance) and used to identify important
predictors of larval presence, with the goal of guiding
larval control interventions and predicting disease risk.
A small sample of entomological studies that follow this
generic recipe is given in Additional file 1. While these
measures of association are important to characterize
larval habitat, here we contend that they may not be
enough to guide larval control initiatives and determine
disease risk.

Our contention is based on the same arguments as 
those that motivated the creation of the population 
attributable fraction (PAF) concept. In the case of PAF, it 
has been argued that measures of association do not 
take into account the prevalence of the different risk 
factors. Thus, a particular risk factor might be statistically
significant but have little public health relevance if
very few people have that risk factor [1]. Similarly, the
risk factors associated with very productive larval
habitats (defined here as water bodies that typically have
larvae) might not be relevant for larval control if water
bodies with those risk factors are rare in the overall
landscape.
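
The PAF argument can be made concrete with a short numerical sketch. The block below assumes Levin's formula, PAF = p(RR - 1) / (1 + p(RR - 1)), which is one common way to compute the PAF and is not necessarily the formulation used in [1]. The function name and the prevalence and relative-risk values are hypothetical, chosen only for illustration.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1))."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: a strong but rare risk factor versus a weaker but common one.
rare_strong = population_attributable_fraction(prevalence=0.01, relative_risk=5.0)
common_weak = population_attributable_fraction(prevalence=0.50, relative_risk=1.5)

print(f"rare, strong factor: PAF = {rare_strong:.3f}")
print(f"common, weak factor: PAF = {common_weak:.3f}")
```

With these made-up inputs, the rare factor accounts for a much smaller share of cases than the common one despite its larger relative risk, which is the same logic we apply to rare but highly productive water bodies.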

Determining the relative abundance of water bodies is 
also critical when predicting disease risk. Researchers 
often perform their analysis given that a water body was 
sampled. In statistical terms, the typical analysis makes
inference on the conditional probability $p(L \mid W)$, where L
and W denote the presence of larvae and the event that a
water body was sampled, respectively. On the other
hand, to understand disease risk, inference should be
made on the marginal probability $p(L)$.
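
One way to make the distinction explicit is the law of total probability. The sketch below rests on two assumptions that go beyond the definitions above: W is read loosely as "a water body is present at a given location", and larvae cannot occur where there is no water body, so that $p(L \mid \bar{W}) = 0$.

```latex
p(L) = p(L \mid W)\,p(W) + p(L \mid \bar{W})\,p(\bar{W})
     = p(L \mid W)\,p(W)
```

Under this reading, a high conditional probability $p(L \mid W)$ can still correspond to a low marginal probability $p(L)$ when water bodies themselves are scarce, that is, when $p(W)$ is small.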

To illustrate, consider the simplified example summarized
in Table 1. Water bodies are sampled in a forested
and a deforested site using the same number of transects
per site. In scenario 1, these transects yield 30 water
bodies (8 of which had larvae) in the forested site and
10 water bodies (8 of which had larvae) in the deforested
site. As a result, the proportion of water bodies with
larvae is $p_{\mathrm{for}} = 8/30 \approx 0.27$ and $p_{\mathrm{def}} = 8/10 = 0.8$ for
the forested and deforested sites, respectively. Based on
these probabilities, a logistic regression would indicate 
that forest cover is negatively associated with the
presence of mosquito larvae, and a researcher would
conclude that people living at forested sites have a lower
infection risk.

Table 1 Description of outcomes for scenarios 1 and 2

Outcomes                                 Scenario 1               Scenario 2
                                         Forested   Deforested    Forested   Deforested
# of water bodies                        30         10            30         10
# of water bodies with larvae            8          8             15         5
Proportion of water bodies with larvae   0.3        0.8           0.5        0.5

This conclusion is incorrect, since both
sites have the same number (i.e., 8) of water bodies with
larvae per area. If these sites give rise to a similar
number of larvae and adult mosquitoes, and if these
mosquitoes have the same degree of contact with the host, then
infection risk should be similar. Alternatively, scenario 2
assumes that the proportion of water bodies with larvae
is identical in both sites ($p_{\mathrm{for}} = p_{\mathrm{def}} = 1/2$), and a logistic
regression would fail to find significant differences
between sites, despite the fact that the forested site has
three times as many water bodies with larvae per area as
the deforested site. Although oversimplified,
these examples highlight that it is critical both to
identify the characteristics of productive larval sites and to
take into account the prevalence of water bodies with
these characteristics. For completeness, we provide examples
with simulated and real data in Additional file 1.
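
The arithmetic behind the two scenarios can be reproduced with a short sketch. The counts are those given in Table 1; the function and variable names are hypothetical, and the odds ratio is computed by hand as a stand-in for what a logistic regression with a single site-type covariate would estimate.

```python
# Counts from Table 1: (water bodies sampled, water bodies with larvae).
# Sampling effort (number of transects) is the same in both sites.
scenarios = {
    "scenario 1": {"forested": (30, 8), "deforested": (10, 8)},
    "scenario 2": {"forested": (30, 15), "deforested": (10, 5)},
}


def proportion_with_larvae(n_bodies, n_with_larvae):
    """Per-water-body outcome: estimate of p(L | W)."""
    return n_with_larvae / n_bodies


def odds_ratio(sites):
    """Odds of larval presence in forested relative to deforested water bodies."""
    def odds(n_bodies, n_with_larvae):
        return n_with_larvae / (n_bodies - n_with_larvae)
    return odds(*sites["forested"]) / odds(*sites["deforested"])


for name, sites in scenarios.items():
    print(name)
    for site, (n_bodies, n_with_larvae) in sites.items():
        p = proportion_with_larvae(n_bodies, n_with_larvae)
        print(f"  {site}: proportion = {p:.2f}, "
              f"water bodies with larvae per area = {n_with_larvae}")
    print(f"  odds ratio (forested vs. deforested) = {odds_ratio(sites):.2f}")
```

In scenario 1 the odds ratio is far below 1 although the per-area counts are equal, whereas in scenario 2 the odds ratio is exactly 1 although the per-area counts differ by a factor of three.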

We believe it is important for researchers to carefully
consider how the outcome of their analysis could inform
policy actions. We reiterate that the typical regression
analysis assumes water bodies to be the sampling unit,
thus yielding results per water body. If the researcher is
primarily interested in infection risk, however, the
response variable more closely associated with infection
risk is likely to be expressed per areal unit (e.g., number
of water bodies with larvae per transect). In other words,
there is a mismatch between the analyzed outcome and the
outcome more relevant for public health policy making. To
avoid this mismatch, we propose two alternatives. First,
one can directly model the number of water bodies with
larvae per transect as a function of transect-level
covariates, assuming that the sampling unit is the transect
itself. Alternatively, one can predict the number of water
bodies per transect and then predict how many have larvae
(as in Additional file 1: Figure S2). Both approaches could
also be used for fixed-area plots. In either modeling
approach, the sampling design for water bodies is critical
and merits careful consideration. Unfortunately, most
studies provide detailed descriptions of how larvae were
sampled within water bodies but not of how water bodies
themselves were sampled (e.g., [2]).
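
The two alternatives can be sketched with a few lines of code. This is a minimal illustration and not the model described in Additional file 1: the transect records are hypothetical, the only covariate is site type, and the first approach uses the fact that a Poisson log-linear model with a single categorical covariate has fitted means equal to the group sample means.

```python
# Hypothetical transect-level records (illustrative only, not from any study):
# (site type, water bodies found on the transect, water bodies with larvae)
transects = [
    ("forested", 6, 2), ("forested", 5, 1), ("forested", 7, 2),
    ("deforested", 2, 2), ("deforested", 1, 1), ("deforested", 3, 2),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def approach_1(records, site):
    """Transect as sampling unit: expected number of water bodies with larvae
    per transect. With one categorical covariate, the Poisson maximum
    likelihood fit for a group is that group's sample mean."""
    return mean(k for s, _, k in records if s == site)


def approach_2(records, site):
    """Two-stage prediction: expected water bodies per transect multiplied by
    the pooled proportion of water bodies with larvae."""
    rows = [(n, k) for s, n, k in records if s == site]
    bodies_per_transect = mean(n for n, _ in rows)
    proportion = sum(k for _, k in rows) / sum(n for n, _ in rows)
    return bodies_per_transect * proportion


for site in ("forested", "deforested"):
    print(f"{site}: approach 1 = {approach_1(transects, site):.2f}, "
          f"approach 2 = {approach_2(transects, site):.2f}")
```

In this toy data set the proportion of water bodies with larvae differs sharply between site types, yet both approaches return the same expected number of water bodies with larvae per transect for the two sites, mirroring scenario 1.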

We emphasize that using water bodies as the sampling
unit is perfectly valid to characterize larval habitat.
However, researchers should be careful when using the derived
measures of association to identify larval control strategies
and predict disease risk. While dengue researchers have
long recognized the importance of accounting for the
abundance of water containers (e.g., [3]), we believe that
the issues we raise have not been taken into account for 
other vector-borne diseases. Although we have focused on 
mosquito larval habitat, our results are likely to apply to 
other types of disease vectors that also rely on water 
bodies. 

Additional file 



Additional file 1: Sample of studies on mosquito breeding habitat, 
examples with real and simulated data, and description of the 
model used to analyze Anopheles darlingi data. 